DEMB: Cache-Aware Scheduling for Distributed Query Processing

Authors

Junyong Lee

Youngmoon Eom

Alan Sussman

Beomseok Nam

Abstract

Leveraging data in distributed caches for large-scale query processing applications is becoming more important, given the current trend toward building large, scalable distributed systems by connecting many heterogeneous, less powerful machines rather than purchasing expensive, homogeneous, very powerful machines. As more servers are added to such clusters, more memory becomes available for caching data objects across the distributed machines. However, the cached objects are dispersed, and traditional query scheduling policies that take only load balancing into account do not effectively utilize the increased cache space. We propose a new multi-dimensional range query scheduling policy for distributed query processing frameworks, called DEMB, that employs a probability distribution estimate derived from recent queries. DEMB accounts for both load balancing and the availability of distributed cached objects in order to improve the cache hit rate for queries, and thereby decrease query turnaround time and increase throughput. We experimentally demonstrate that DEMB produces better query plans and lower query response times than other query scheduling policies.
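To make the idea of scheduling from a distribution estimate of recent queries concrete, here is a minimal sketch. It is not the DEMB algorithm, whose details the abstract does not give. It assumes one-dimensional query points, a fixed sliding window, and equal-probability (quantile) boundaries. The class `QuantileBoundaryScheduler` and its methods are hypothetical names introduced only for illustration.

```python
from bisect import bisect_right
from collections import deque


class QuantileBoundaryScheduler:
    """Hypothetical 1-D scheduler; an illustration, not the paper's DEMB policy."""

    def __init__(self, num_servers, window=1000):
        self.num_servers = num_servers
        # Sliding window of recent query points (assumed 1-D for simplicity).
        self.recent = deque(maxlen=window)
        # num_servers - 1 split points; empty until rebalance() is called.
        self.boundaries = []

    def observe(self, point):
        """Record a query point so it contributes to the next estimate."""
        self.recent.append(point)

    def rebalance(self):
        """Place boundaries at empirical quantiles of the recent queries.

        Each server then receives roughly the same share of queries
        (load balancing), and nearby queries map to the same server
        (which favors hits in that server's cache).
        """
        pts = sorted(self.recent)
        n = len(pts)
        if n == 0:
            self.boundaries = []
            return
        self.boundaries = [
            pts[(i * n) // self.num_servers]
            for i in range(1, self.num_servers)
        ]

    def assign(self, point):
        """Return the index of the server whose interval contains the point."""
        return bisect_right(self.boundaries, point)


scheduler = QuantileBoundaryScheduler(num_servers=4)
for q in range(100):
    scheduler.observe(q)
scheduler.rebalance()
server_for_query = scheduler.assign(60)
```

In this sketch a uniform stream of queries splits evenly across the four servers, while a skewed stream would pull the boundaries toward the dense region so that hot ranges are divided among more servers. A multi-dimensional policy such as the one the abstract describes would need a different partitioning scheme.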

Related resources

Multiple query scheduling for distributed semantic caches

In distributed query processing systems, load balancing plays an important role in maximizing system throughput. When queries can leverage cached intermediate results, improving the cache hit ratio becomes as important as load balancing in query scheduling, especially when dealing with computationally expensive queries. The scheduling policies must be designed to take into consideration the dyn...

Cooperative caching for grid-enabled OLAP

In this paper, we propose a grid-based On-Line Analytical Processing (OLAP) application which distributes query computation across an enterprise grid. Our application follows a two-tiered process for answering queries based on sharing Cached OLAP data between the users at the local grid site and using grid scheduling approaches to execute the remaining parts of a query amongst a distributed set...

EM-KDE: A locality-aware job scheduling policy with distributed semantic caches

In modern query processing systems, the caching facilities are distributed and scale with the number of servers. To maximize the overall system throughput, the distributed system should balance the query loads among servers and also leverage cached results. In particular, leveraging distributed cached data is becoming more important as many systems are being built by connecting many small heter...

Efficient Distributed Top-k Query Processing with Caching

Recently, there has been an increased interest in incorporating in database management systems rank-aware query operators, such as top-k queries, that allow users to retrieve only the most interesting data objects. In this paper, we propose a cache-based approach for efficiently supporting top-k queries in distributed database management systems. In large distributed systems, the query performa...

An Optimizing Query Processor with an Efficient Caching Mechanism for Distributed Databases

This paper provides an efficient way of querying among many distributed and heterogeneous data sources. We describe a database optimization framework that supports data and computation reuse, query scheduling and caching mechanism to speed up the evaluation of multiquery workload. The Caching query result is stored as an eXtensible Markup Language (XML) document. An XML oriented common data mod...

Publication date: 2012
